In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display
%matplotlib inline
plt.style.use('fivethirtyeight')
path = 'data/jhu_wifi/WigleWifi_20150911210322.csv'
In [2]:
df = pd.read_csv(path, encoding = "ISO-8859-1")
df.head()
Out[2]:
Now that the data is loaded into a pandas DataFrame, we can look at some of its basic properties.
In [3]:
display('Number of rows: {}'.format(len(df)))
display('Unique SSIDs: {}'.format(len(df['SSID'].unique())))
display('Unique MACs: {}'.format(len(df['MAC'].unique())))
display('Number of Auth Mode types: {}'.format(len(df['AuthMode'].unique())))
In [4]:
def auth_filter(x):
    if 'WPA2' in x:
        return 'WPA2'
    elif 'WPA' in x:
        return 'WPA'
    elif 'WEP' in x:
        return 'WEP'
    else:
        return 'OPEN'

df['AuthMode'].apply(auth_filter).value_counts().plot(kind='barh')
Out[4]:
So there are a significant number of open networks, but the overall majority use WPA2. That's good for the University but not so great for attackers. Of course, there could be a way around that via WPS. How many networks use that?
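The order of the checks in `auth_filter` matters: 'WPA' is a substring of 'WPA2', so WPA2 has to be tested first. As a minimal sketch of the same classification done with vectorized string matching, the snippet below labels a handful of made-up capability strings and reports each category's share. The sample values and the helper name `classify_auth` are illustrative assumptions, not taken from the survey data.

```python
import pandas as pd

# Illustrative capability strings in the bracketed style seen in the AuthMode
# column; these sample values are assumptions, not rows from the survey.
sample = pd.Series([
    '[WPA2-PSK-CCMP][WPS][ESS]',
    '[WPA-PSK-TKIP][WPA2-PSK-CCMP][ESS]',
    '[WPA-PSK-TKIP][ESS]',
    '[WEP][ESS]',
    '[ESS]',
])

def classify_auth(modes):
    """Label each capability string with the strongest protocol it advertises."""
    labels = pd.Series('OPEN', index=modes.index)
    # Apply the weakest protocol first so stronger ones overwrite it.
    # 'WPA' also matches 'WPA2' strings, which is why WPA2 is assigned last.
    for proto in ['WEP', 'WPA', 'WPA2']:
        labels[modes.str.contains(proto, regex=False, na=False)] = proto
    return labels

labels = classify_auth(sample)
# normalize=True turns raw counts into fractions of the total.
shares = labels.value_counts(normalize=True)
print(shares)
```

On the real data, `classify_auth(df['AuthMode']).value_counts(normalize=True)` would give the fraction of rows in each category, which is a quick way to check whether WPA2 really accounts for more than half of them.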
In [5]:
def wps(x):
    if 'WPS' in x:
        return 'WPS'
    else:
        return 'Not WPS'

df['AuthMode'].apply(wps).value_counts().plot(kind='barh')
Out[5]:
Over 500 networks use WPS! With a tool like Reaver, which brute-forces the WPS PIN, an attacker could breach many of those networks.
These are just some basic insights into the data. We could look further into the different forms of WPA/WPA2 authentication, but for an attacker these insights are enough. Using the above function for extracting WPS networks, an attacker could determine the locations of those networks and mount an attack on each of them.
In [6]:
s = df['AuthMode'].apply(wps)
# .ix was removed in pandas 1.0; select the WPS rows with a boolean mask via .loc
wps_entries = df.loc[s == 'WPS']
wps_entries.head()
Out[6]:
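The same access point can show up in several rows of a wardriving log, so a per-network view needs one row per MAC. The sketch below is one possible way to do that: it keeps, for each WPS-enabled MAC, the sighting with the strongest signal, on the assumption that this sighting was recorded closest to the access point. The columns `RSSI`, `CurrentLatitude` and `CurrentLongitude` are assumed to exist in the export, the helper name `strongest_wps_sightings` is hypothetical, and the observations are invented for illustration.

```python
import pandas as pd

# Invented observations. RSSI, CurrentLatitude and CurrentLongitude are
# assumed column names; only MAC, SSID and AuthMode appear in the notebook.
obs = pd.DataFrame({
    'MAC': ['aa:aa', 'aa:aa', 'bb:bb', 'cc:cc'],
    'SSID': ['lab', 'lab', 'cafe', 'guest'],
    'AuthMode': [
        '[WPA2-PSK-CCMP][WPS][ESS]',
        '[WPA2-PSK-CCMP][WPS][ESS]',
        '[WPA2-PSK-CCMP][ESS]',
        '[WPA-PSK-TKIP][WPS][ESS]',
    ],
    'RSSI': [-80, -55, -60, -70],
    'CurrentLatitude': [10.0010, 10.0020, 10.0030, 10.0040],
    'CurrentLongitude': [20.0010, 20.0020, 20.0030, 20.0040],
})

def strongest_wps_sightings(frame):
    """Return one row per WPS-enabled MAC: the sighting with the highest RSSI."""
    is_wps = frame['AuthMode'].str.contains('WPS', regex=False, na=False)
    wps_rows = frame[is_wps]
    # RSSI is in dBm, so the largest (least negative) value is the strongest.
    best_index = wps_rows.groupby('MAC')['RSSI'].idxmax()
    return wps_rows.loc[best_index].reset_index(drop=True)

targets = strongest_wps_sightings(obs)
print(targets[['MAC', 'SSID', 'RSSI', 'CurrentLatitude', 'CurrentLongitude']])
```

Counting `len(targets)` rather than the number of matching rows also gives the number of distinct WPS access points, which can be lower than the row count when a network was logged more than once.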